Continuous time Markovian decision processes average return criterion

Related resources

Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes

Reinforcement learning (RL) has become a central paradigm for solving learning-control problems in robotics and artificial intelligence. RL researchers have focussed almost exclusively on problems where the controller has to maximize the discounted sum of payoffs. However, as emphasized by Schwartz (1993), in many problems, e.g., those for which the optimal behavior is a limit cycle, it is mo...

Bounded Parameter Markov Decision Processes with Average Reward Criterion

Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with uncertainty in the parameters of a Markov Decision Process (MDP). Unlike the case of an MDP, the notion of an optimal policy for a BMDP is not entirely straightforward. We consider two notions of optimality based on optimistic and pessimistic criteria. These have been analyzed for discounted BMDPs. Here we pro...

Fuzzy Decision Processes with an Average Reward Criterion

Using the same framework as for fuzzy decision processes in the discounted case, we specify an average fuzzy criterion model and develop its optimization by “fuzzy max order” under appropriate conditions. By introducing a relative value function, the average reward is characterized as the unique solution of the associated equation. We also derive the optimality equation using the “vanishing disco...

Continuous time Markov decision processes

In this paper, we consider denumerable-state continuous-time Markov decision processes with (possibly unbounded) transition and cost rates under the average criterion. We present a set of conditions under which we prove the existence of both average-cost optimal stationary policies and a solution of the average optimality equation. The results in this paper are applied to an admission con...

Markov decision evolutionary games with time average expected fitness criterion

We present a class of evolutionary games involving large populations that have many pairwise interactions between randomly selected players. The fitness of a player depends not only on the actions chosen in the interaction but also on the individual state of the players. Players stay permanently in the system and participate infinitely often in local interactions with other randomly selected pl...

Journal

Journal title: Journal of Mathematical Analysis and Applications

Year: 1975

ISSN: 0022-247X

DOI: 10.1016/0022-247x(75)90063-3